HTTP compression

HTTP compression is a capability that can be built into web servers and web clients to make better use of available bandwidth and to provide faster transmission between them.[1] HTTP data is compressed before it is sent from the server: compliant browsers announce to the server which methods they support before downloading the data in the corresponding format; browsers that do not support any compliant compression method download uncompressed data. The most common compression schemes are gzip and deflate; a full list of available schemes is maintained by IANA.[2] Additionally, third parties develop new methods and include them in their products (e.g. the Google SDCH scheme implemented in the Google Chrome browser and used on certain Google servers).
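
As a rough illustration of the bandwidth argument, the following sketch compresses a sample payload with Python's standard gzip and zlib modules, which produce the formats used by the gzip and deflate content-codings respectively. The payload is a made-up, highly repetitive HTML fragment, so the savings it reports are only indicative and will differ for real pages.

```python
import gzip
import zlib

# Hypothetical payload: repetitive markup of the kind found in HTML pages.
html = ('<li class="item">Hello, world</li>\n' * 200).encode("utf-8")

gzip_body = gzip.compress(html)     # format used by the "gzip" content-coding
deflate_body = zlib.compress(html)  # zlib format, used by the "deflate" content-coding


def saved_percent(original: bytes, compressed: bytes) -> float:
    """Return the percentage of bytes saved by compression."""
    return 100.0 * (1 - len(compressed) / len(original))


print(f"original: {len(html)} bytes")
print(f"gzip:     {len(gzip_body)} bytes ({saved_percent(html, gzip_body):.1f}% saved)")
print(f"deflate:  {len(deflate_body)} bytes ({saved_percent(html, deflate_body):.1f}% saved)")
```

Both results decompress back to the original bytes with gzip.decompress and zlib.decompress, which is what a compliant client does before rendering the page.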

A 2009 article by Google engineers Arvind Jain and Jason Glasgow states that more than 99 person-years are wasted daily due to increased page load times when users do not receive compressed content. This occurs when anti-virus software interferes with connections to force them to be uncompressed, when proxies are used (with overcautious web browsers), when servers are misconfigured, and when browser bugs prevent compression from being used. Internet Explorer 6, which drops to HTTP 1.0 (without features like compression or pipelining) when behind a proxy, a common configuration in corporate environments, was the mainstream browser most prone to falling back to uncompressed HTTP.[3]

Client/Server compression scheme negotiation

In most cases, excluding SDCH, the negotiation is done in two steps, described in RFC 2616:

1. The web client includes an Accept-Encoding field in the HTTP request, listing the names of the compression schemes it supports (called content-coding tokens), separated by commas.

GET /encrypted-area HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, deflate

2. If the server supports one or more of these compression schemes, it may compress the outgoing data using one or more methods supported by both parties. In that case, the server adds a Content-Encoding field to the HTTP response, listing the schemes used, separated by commas.

HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix)  (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Etag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Content-Length: 438
Connection: close
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip

The web server is by no means obliged to use any compression method; this depends on the internal settings of the web server and may also depend on the internal architecture of the website in question.
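
The two negotiation steps can be sketched from the server's side as follows. This is a minimal illustration rather than how any particular web server implements it: the names SUPPORTED, parse_accept_encoding and negotiate are hypothetical, the server is assumed to prefer gzip over deflate, and the "*" wildcard that RFC 2616 allows in Accept-Encoding is not handled. The optional q-values defined by RFC 2616 are parsed only so that a coding marked q=0 is treated as refused.

```python
import gzip
import zlib

# Codings this hypothetical server can produce, in its own order of preference.
SUPPORTED = {
    "gzip": gzip.compress,
    "deflate": zlib.compress,  # HTTP "deflate" uses the zlib container format
}


def parse_accept_encoding(header: str) -> dict[str, float]:
    """Map each content-coding token in the header to its q-value (default 1.0)."""
    codings: dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        token, *params = [part.strip() for part in item.split(";")]
        q = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        codings[token.lower()] = q
    return codings


def negotiate(accept_encoding: str, body: bytes) -> tuple[dict[str, str], bytes]:
    """Return response headers and the (possibly compressed) body."""
    accepted = parse_accept_encoding(accept_encoding)
    headers: dict[str, str] = {}
    for token, compress in SUPPORTED.items():
        if accepted.get(token, 0.0) > 0:
            body = compress(body)
            headers["Content-Encoding"] = token
            break
    # No common coding (or no header at all): the body is sent uncompressed.
    headers["Content-Length"] = str(len(body))
    return headers, body
```

For example, negotiate("gzip, deflate", page) returns a gzip-compressed body together with a Content-Encoding: gzip header, while negotiate("", page) returns the body unchanged and no Content-Encoding header.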

In the case of SDCH, a dictionary negotiation is also required, which may involve additional steps, such as downloading the proper dictionary from an external server.

Content-coding tokens

Servers that support HTTP compression

Compression in HTTP can also be achieved by using the functionality of server-side scripting languages such as PHP, or of programming languages such as Java.
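
The same idea can be sketched in Python, used here only as a stand-in for the languages named above: the application compresses its own output instead of relying on the web server. The example is a small WSGI application; the names PAGE and application are hypothetical, and the Accept-Encoding check is deliberately simplified (it ignores q-values).

```python
import gzip

# Hypothetical page content served by the application.
PAGE = (
    "<html><body>" + "<p>Hello, compressed world.</p>" * 100 + "</body></html>"
).encode("utf-8")


def application(environ, start_response):
    """WSGI application that gzips its own output when the client accepts gzip."""
    body = PAGE
    headers = [("Content-Type", "text/html; charset=UTF-8")]

    # Simplified check: split the header on commas and ignore any q-values.
    accept_encoding = environ.get("HTTP_ACCEPT_ENCODING", "")
    accepted = [token.split(";")[0].strip().lower() for token in accept_encoding.split(",")]

    if "gzip" in accepted:
        body = gzip.compress(body)
        headers.append(("Content-Encoding", "gzip"))

    # Content-Length must describe the bytes actually sent.
    headers.append(("Content-Length", str(len(body))))
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the application can be passed to make_server from the standard wsgiref.simple_server module. In production, compression is more commonly left to the web server or a dedicated middleware layer.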

References

  1. ^ "Using HTTP Compression (IIS 6.0)". Microsoft Corporation. http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/d52ff289-94d3-4085-bc4e-24eb4f312e0e.mspx?mfr=true. Retrieved 9 February 2010. 
  2. ^ RFC 2616, Section 3.5: "The Internet Assigned Numbers Authority (IANA) acts as a registry for content-coding value tokens."
  3. ^ http://code.google.com/speed/articles/use-compression.html
  4. ^ "Compression Tests". Verve Studios, Co. http://www.vervestudios.co/projects/compression-tests/. Retrieved 23 March 2011. 
  5. ^ "Frequently Asked Questions about zlib - What's the difference between the "gzip" and "deflate" HTTP 1.1 encodings?". Greg Roelofs, Jean-loup Gailly and Mark Adler. http://www.gzip.org/zlib/zlib_faq.html#faq38. Retrieved 23 March 2011. 
  6. ^ "Compression Tests: Results". Verve Studios, Co. http://www.vervestudios.co/projects/compression-tests/results. Retrieved 23 March 2011. 
  7. ^ JSR 200: Network Transfer Format for Java Archives.
  8. ^ "Compression Tests". Verve Studios, Co. http://www.vervestudios.co/projects/compression-tests/. Retrieved 23 March 2011. 
  9. ^ "HOWTO: Use Apache mod_deflate To Compress Web Content (Accept-Encoding: gzip) - Mark S. Kolich". Mark S. Kolich. http://mark.koli.ch/2009/04/howto-use-apache-mod-deflate-to-compress-web-content-obsessed-with-speed-of-kolichcommobi.html. Retrieved 23 March 2011. 

External links